PHOENIX-4981 Add tests for ORDER BY, GROUP BY and salted tables using phoenix-spark #402
Conversation
I also had to bump the Spark version up to 2.3.2, as this version has more SQL support, in order to get the tests to pass.
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;
import static org.junit.Assert.fail;
nit: Looks like this diff is just due to reordering imports. Please refactor
done
private HintNode.Hint hint;
private boolean escapeCols;
private boolean distinct;
private int limit;
Can we make all member variables final and have the builder's build method set all values in a private constructor so that we follow the builder pattern more closely?
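For illustration, a minimal sketch of the pattern being suggested (field names taken from the diff above; everything else is assumed, not the PR's actual code):

public class QueryBuilder {
    // Final members: values are fixed exactly once by the private constructor.
    private final HintNode.Hint hint;
    private final boolean escapeCols;
    private final boolean distinct;
    private final int limit;

    // Private constructor so instances can only come from the Builder.
    private QueryBuilder(Builder b) {
        this.hint = b.hint;
        this.escapeCols = b.escapeCols;
        this.distinct = b.distinct;
        this.limit = b.limit;
    }

    public static class Builder {
        private HintNode.Hint hint;
        private boolean escapeCols;
        private boolean distinct;
        private int limit;

        public Builder setLimit(int limit) { this.limit = limit; return this; }
        // ... setters for the remaining fields ...

        public QueryBuilder build() { return new QueryBuilder(this); }
    }
}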
IMHO I don't think that's necessary, as we don't have an object that represents a Query; build() just returns a string.
Ok, makes sense.
List<KeyRange> splits = TestUtil.getAllSplits(conn, tableName);
assertEquals(nGuidePosts, splits.size());
}

@Test
public void testGroupByWithAliasWithSameColumnName() throws SQLException {
Why is this test case not applicable to phoenix-spark?
The query tests joins, and QueryBuilder currently doesn't support generating a join query over two tables.
phoenix-spark/pom.xml
@@ -487,6 +487,16 @@
<testSourceDirectory>src/it/scala</testSourceDirectory>
<testResources><testResource><directory>src/it/resources</directory></testResource></testResources>
<plugins>
<!--
Remove this commented-out section.
I had commented this out by mistake; fixed now.
catch (Exception e) {
    assertTrue(e.getMessage().contains(expectedPhoenixExceptionMsg));
}
return rs;
Do we ever want code to reach here? Or do we want to Assert.fail if the exception doesn't occur?
We should fail if an exception is not thrown; I fixed this.
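For reference, a minimal sketch of the fixed pattern (variable and method names taken from the surrounding diff):

try {
    rs = executeQuery(conn, queryBuilder);
    // fail() throws an AssertionError, which the Exception handler below
    // does not catch, so reaching this line always fails the test.
    fail("Expected an exception containing: " + expectedPhoenixExceptionMsg);
} catch (Exception e) {
    assertTrue(e.getMessage().contains(expectedPhoenixExceptionMsg));
}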
sql="select count(*) from "+intTableName; | ||
QueryBuilder queryBuilder = new QueryBuilder(); | ||
queryBuilder.setSelectExpression("COUNT(*)"); | ||
queryBuilder.setFullTableName(intTableName); |
You can instead do method chaining here since you have a fluent interface.
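For example, assuming each setter returns this, the calls above collapse into one chain:

QueryBuilder queryBuilder = new QueryBuilder()
        .setSelectExpression("COUNT(*)")
        .setFullTableName(intTableName);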
Done.
assertEquals(1, rs.getLong(1));

sql = "select count(*) from " + intTableName + " where b.colb is null";
queryBuilder.setWhereClause("`B.COLB` IS NULL");
It seems like the only difference between this test and the one in phoenix-core is the backticks provided to QueryBuilder setter methods. I'm guessing this is a result of how SparkUtil executes queries. Please correct me if I'm wrong, but if not, can we further reuse the code from the phoenix-core tests instead of having more duplication?
If a column name contains a dot, Spark SQL requires the backticks. Automatically generating the SQL for this is difficult, especially when columns are part of expressions, etc.
Ok let's leave it the way it is for now.
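For context, a small illustration of the difference (the clauses are taken from the test above):

// Phoenix JDBC: the dot separates column family and column name.
String phoenixWhere = "B.COLB IS NULL";
// Spark SQL: a bare dot is parsed as a table/struct qualifier, so the whole
// name must be wrapped in backticks to be read as a single identifier.
String sparkWhere = "`B.COLB` IS NULL";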
String expectedPhoenixExceptionMsg, String expectedSparkExceptionMsg) {
    ResultSet rs = null;
    try {
        rs = executeQuery(conn, queryBuilder);
Similar question/comment about Assert.fail here as well.
fixed.
Dataset phoenixDataSet =
    new PhoenixRDD(SparkUtil.getSparkContext(), tableName1,
        JavaConverters.collectionAsScalaIterableConverter(table1Columns)
            .asScala().toSeq(),
Can you add some comments here?
done
import java.util.Map;

/**
 * Helper class to convert a List of Rows returns from a dataset to a sql ResultSet
nit: 'returned'
fixed.
public static ResultSet executeQuery(Connection conn, QueryBuilder queryBuilder, String url, Configuration config)
        throws SQLException {
    SQLContext sqlContext = new SQLContext(SparkUtil.getSparkContext());
It looks as though SQLContext is deprecated. Quoting: "Deprecated. Use SparkSession.builder instead. Since 2.0.0."
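For reference, a sketch of the non-deprecated route (the app name and master are taken from the test helper later in this PR; the rest is assumed):

// SparkSession is the Spark 2.x entry point; sqlContext() exposes the legacy
// SQLContext without invoking its deprecated constructor.
SparkSession spark = SparkSession.builder()
        .appName("Java Spark Tests")
        .master("local[2]")
        .getOrCreate();
SQLContext sqlContext = spark.sqlContext();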
@twdsilva some comments and questions. Overall looks really good!
(force-pushed from 49cf9e8 to 565b157)
@ChinmaySKulkarni Thanks for the review. I have updated the PR; please take a look.
@twdsilva changes look good. Do we want to continue using the deprecated SQLContext?
@ChinmaySKulkarni I removed the use of the deprecated SQLContext constructor, please take a look.
stmt.execute();
conn.commit();

// create two PhoenixRDDs using the table namea and columns that are required for the JOIN query
nit: typo
}

public static SQLContext getSqlContext() {
    return SparkSession.builder().appName("Java Spark Tests").master("local[2]")
nit: Extract all these strings as static final member variables
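For example (constant names hypothetical):

private static final String APP_NAME = "Java Spark Tests";
private static final String SPARK_MASTER = "local[2]";

public static SQLContext getSqlContext() {
    return SparkSession.builder().appName(APP_NAME).master(SPARK_MASTER)
            .getOrCreate().sqlContext();
}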
@twdsilva A couple of minor nits, otherwise looks good to go. Thanks!
(force-pushed from 854a71a to 57ac77e)
Thanks @ChinmaySKulkarni, fixed the nits, will get this committed.
(force-pushed from 57ac77e to 91fc5f9)
@karanmehta93 thanks for the review, nice catch. I modified the SparkContext variable to be volatile.
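A sketch of the safe-publication pattern being described (only the volatile keyword is confirmed by the thread; the lazy-initialization shape and names are assumed):

// volatile guarantees that a fully constructed SparkContext is visible to all
// threads; double-checked locking avoids synchronizing on every read.
private static volatile SparkContext sparkContext = null;

public static SparkContext getSparkContext() {
    if (sparkContext == null) {
        synchronized (SparkUtil.class) {
            if (sparkContext == null) {
                sparkContext = SparkSession.builder().appName("Java Spark Tests")
                        .master("local[2]").getOrCreate().sparkContext();
            }
        }
    }
    return sparkContext;
}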
@ChinmaySKulkarni can you please review? I refactored AggregateIT, OrderByIT and SaltedIT so that they can be used to run queries using phoenix-spark. I created Base*IT classes based on these tests, each with two subclasses (one for phoenix, one for phoenix-spark). I added a QueryBuilder to generate a SQL query that is used to set up the Spark SQL query. I also added a SparkResultSet that implements the JDBC interface so that the existing tests can be reused without many changes.
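To make the shape of that refactor concrete, a hypothetical sketch (the Base*IT/subclass split and QueryBuilder come from the description above; every other name is assumed, not quoted from the PR):

// The base class owns the test logic; each subclass decides how a
// QueryBuilder-generated query is actually executed.
public abstract class BaseAggregateIT {
    protected abstract ResultSet executeQuery(Connection conn, QueryBuilder queryBuilder)
            throws SQLException;
}

// phoenix-core flavor: run the generated SQL directly over JDBC.
public class AggregateIT extends BaseAggregateIT {
    @Override
    protected ResultSet executeQuery(Connection conn, QueryBuilder queryBuilder)
            throws SQLException {
        return conn.createStatement().executeQuery(queryBuilder.build());
    }
}

// phoenix-spark flavor: run the query through Spark SQL and wrap the returned
// Rows in SparkResultSet, which implements the JDBC ResultSet interface.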